Data storage using spreadsheet and metatags

ABSTRACT

Metatag identifiers are stored in a spreadsheet, and are made available for use in metatagging various files. Steps may include: identifying the item of data in a document; activating an activation code; providing a listing of metatag choices; selecting a metatag from the listing of metatag choices; identifying the selected metatag with a column in the spreadsheet; and storing at least a portion of the item of data in a cell of the column. The listing is preferably a visually displayed listing, and selection can be made by clicking. Data previously associated with a metatag, and stored in the spreadsheet can be displayed as values. The values for a given metatag can thus be sorted and listed, providing immediate feedback to a user to assist in determining the propriety of a particular metatag.

FIELD OF THE INVENTION

[0001] The field of the invention is computer software, especially data storage technologies.

BACKGROUND OF THE INVENTION

[0002] There are numerous different types of databases. One characteristic defining most such databases, however, is the use of a structured method of storing the data. A flat table type of database, for example, is typically stored as a data stream with markers that designate columns and rows. In display of such databases the rows typically correspond to records, and the columns correspond to fields.

[0003] Spreadsheets can be thought of as a species of flat table databases, in which the display of the database is used as the sole, or at least a primary, data entry interface. Indeed, it is known to readily exchange data between spreadsheets such as Microsoft™ Excel™ and databases such as Microsoft™ Access™. One difference is that spreadsheets typically include the column identification information in cells of the first row, whereas databases typically include the column identification information in a header that does not appear to a user to be mixed in with the data.

[0004] With heightened need to provide simplified connectivity among different platforms and different types of databases, a new form of database has emerged that relies on metatags, rather than logical structure, to identify data. The term “metatag” is used herein to mean an identifier of a type of data content that may be used repeatedly in a document to identify multiple occurrences of data having the particular type of data content. Thus, a metatag may be employed to identify data having data content relating to “price”, “description”, or “product number”, and may even use those exact words as the metatag names. An electronic copy of a letter to a customer may then be “tagged” by blocking a price within the letter, hitting a special activation code, and then typing the word “price”. Corresponding data within the letter can be tagged in a similar manner to identify description and product number. It is also known to nest the metatagged data, such as by blocking together the price, description, and product number, and identifying all three items of data with a metatag such as “product”. XML (Extensible Markup Language), an open standard published by the World Wide Web Consortium, is an exemplary language that employs metatags to store data. XML is described in numerous publications, including McLaughlin, Brett, Java & XML, 2nd Edition: Solutions to Real-World Problems, O'Reilly & Associates; September, 2001, ISBN: 0596001975, and Harold, Elliotte Rusty and Means, W. Scott, XML in a Nutshell: A Desktop Quick Reference (Nutshell Handbook), Jan. 15, 2001, O'Reilly & Associates; ISBN: 0596000588, both of which are incorporated by reference herein.
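By way of illustration only, the nested tagging described above can be sketched in Python using the standard xml.etree.ElementTree module. The letter wording, the values, and the element name product_number (XML names cannot contain spaces) are assumptions made for this sketch and are not taken from any actual document.

```python
import xml.etree.ElementTree as ET

# Hypothetical letter text with three tagged items nested under "product".
letter = ET.Element("letter")
letter.text = "We are pleased to offer the "

product = ET.SubElement(letter, "product")

description = ET.SubElement(product, "description")
description.text = "garden hose"
description.tail = ", item "          # untagged text following the tagged item

number = ET.SubElement(product, "product_number")
number.text = "GH-100"
number.tail = ", at "

price = ET.SubElement(product, "price")
price.text = "$19.99"

product.tail = " each."               # untagged text after the group

# Serialize the letter with its metatags interspersed among untagged text.
markup = ET.tostring(letter, encoding="unicode")
print(markup)
```

The tagged items remain in their original positions within the letter, and the untagged text survives around them.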

[0005] A major advantage of metatagged data (also referred to as “tagged” data for simplicity herein) is that such data can be properly identified within substantially any type of document, from an image file to a text file, regardless of the structure of the file. In such instances the tagged data items are readily interspersed among non-tagged data by delimiters that correlate a metatag with its associated data. Another advantage is that data storage space is not wasted on cells for which there is no data. In an ordinary flat database having 20 records and 7 fields, for example, a database system may allocate space for 20*7=140 cells. If only 50 of those cells have data, then 90 of the 140 cells, or roughly 64% of the storage space, are wasted. With a metatagged file, there are no unused cells because there are no cells at all.
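The arithmetic of the flat database example can be checked with a few lines of Python. The variable names are chosen for this sketch only, and the calculation assumes that every cell occupies the same amount of storage.

```python
# Figures from the flat database example: 20 records, 7 fields, 50 filled cells.
records, fields, filled = 20, 7, 50

allocated = records * fields          # cells the database system sets aside
unused = allocated - filled           # cells that hold no data
wasted_fraction = unused / allocated  # share of allocated storage left empty

print(allocated, unused, round(wasted_fraction * 100, 1))
```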

[0006] Conversely, if the metatags are excessively repeated within a document, the use of metatags can also produce inefficiency. If there are 1000 occurrences of each of three data types, it would be much more efficient to store the data in a flat database having 1000 records of 3 fields than using a metatagged structure. The flat database would store the field names only once, but the metatagged structure would store one or another of the metatag names 3000 times.
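The overhead of repeated metatag names can be estimated as follows. This is a rough sketch that counts only the characters of the names themselves, ignores delimiters, and assumes the three names “price”, “description”, and “product number” used as examples earlier.

```python
# Assumed metatag names; each data type occurs 1000 times.
names = ["price", "description", "product number"]
occurrences_per_type = 1000

# A flat database stores each field name once, in its header.
flat_name_copies = len(names)
flat_name_chars = sum(len(name) for name in names)

# A metatagged file repeats the name with every tagged occurrence.
tagged_name_copies = len(names) * occurrences_per_type
tagged_name_chars = flat_name_chars * occurrences_per_type

print(flat_name_copies, tagged_name_copies)
print(flat_name_chars, tagged_name_chars)
```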

[0007] There is also the problem of knowing which metatags can or should be used to tag data within a document. This is potentially an ongoing problem for those involved in metatagging documents, precisely because metatags are not limited to a fixed set of names, and there are as yet no generally accepted metatag naming conventions. This is not typically a problem for other types of databases because the person setting up the database has already set a relatively fixed list of possible data fields, and those entering data are limited to those pre-set fields. Moreover, the literal naming of the fields is usually irrelevant to a typical user.

[0008] There is still the further problem that tagged data in metatagged files can be difficult to visualize. For example, spreadsheet data is very readily visualized in the well-known column and row format, in which each column effectively stores data for a different type of data content. Not only can data in adjacent cells be visually compared, but mathematical functions on data in one row are readily ported to other rows, and mathematical functions on data in one column are readily ported to other columns. All of these things can be very difficult in metatagged documents because data for any given type of content (data that would be listed in the same column of a spreadsheet) can be located all over a document.

[0009] It is thus interesting that spreadsheets and metatagged files have advantages and disadvantages that are to a large extent complementary. This fact either does not seem to have been appreciated by others, or they have not developed a solution to take proper advantage of such complementarity. Thus, there is a need for methods and embodiments that advantageously combine features of spreadsheets for use with metatags.

SUMMARY OF THE INVENTION

[0010] Methods and software are provided in which metatag identifiers are stored in a spreadsheet, and are made available for use in metatagging various files.

[0011] In one aspect of the invention an item of data is stored in a spreadsheet using the steps of: identifying the item of data in a document; activating an activation code; providing a listing of metatag choices; selecting a metatag from the listing of metatag choices; identifying the selected metatag with a column in the spreadsheet; and storing at least a portion of the item of data in a cell of the column.

[0012] Virtually any document can be metatagged. Preferred embodiments include tagging a text document such as a Microsoft™ Word™ document, or an HTML document such as an Internet Web page. Identifying the item of data can occur in any suitable fashion, including blocking the item of data. Presumably the entire blocked data would be tagged, but a subset (i.e., a portion) could alternatively be tagged. The activation code is preferably a right mouse click operation, but can be any key or combination of keys such as control-shift-m. Verbal or other non-keyboard commands for blocking or activating are also contemplated.

[0013] The listing is preferably a visually displayed listing, presumably on a computer screen or other display device. Sorted lists are preferable, and selection can be readily made by clicking. Here again, however, any suitable expression means is contemplated, including verbal reading of choices.

[0014] Multiple pages of the spreadsheet can be employed, with both data and metatag names being stored on the same page or on different pages. In the first case the metatag names are probably best stored in the first row of the spreadsheet. In the document, the metatags are preferably stored proximally to the data being tagged.

[0015] It is also contemplated that data previously associated with a metatag, and stored in the spreadsheet, can be displayed as values. The values for a given metatag can thus be sorted and listed, providing immediate feedback to a user to assist in determining the propriety of a particular metatag. It is still further contemplated that different pages of the spreadsheet can be used to store different sets of data, with the metatags differing at least in part from one page to another. A classification index can be derived that assists a user in determining the propriety of a classification.

[0016] Various objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description of preferred embodiments of the invention, along with the accompanying drawings in which like numerals represent like components.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] FIG. 1 is a schematic showing correspondences among a spreadsheet, a metatagged document, and various supporting listings.

[0018] FIG. 2 is a flowchart of steps in a preferred embodiment.

DETAILED DESCRIPTION

[0019] FIG. 1 generally depicts a spreadsheet 10, a text document 110, a metatag listing 210, and a values listing 310.

[0020] Spreadsheet 10 generally includes a plurality of columns 21-25, and a plurality of rows 31-37. Cells occur at the intersection of the columns and rows, including cells 41 and 42 at the intersection of columns 21 and 22 with row 31, respectively, cells 43-44 at the intersections of columns 21-22 and row 32, respectively, and cells 45-46 at the intersections of column 21 with rows 33-34, respectively. Notwithstanding the drawing, it should be appreciated that spreadsheet 10 is exemplary only, and all sizes and configurations of possible spreadsheets are contemplated. Cells 41, 42 contain metatag names.

[0021] Document 110 is preferably a Microsoft™ Word™ file, but may be WordPerfect™ or any other type of text document, or even a non-text document. Especially contemplated are other types of documents including PowerPoint™ presentations, web pages, and so on. Document 110 may therefore have any suitable content, including personal or business letters, advertisements, formal or informal notes, and images of various types.

[0022] In the particular example of FIG. 1, item 121 is text or other material that is of no particular consequence to the discussion herein, and is therefore shown merely as squiggled lines. Items 122-125 are data items that are stored in cells 43-46, respectively.

[0023] Right clicking on data item 125 causes the software to produce a metatag listing 210 as depicted by arrow 150. In this particular example, the listing would contain the metatag names 221-225 stored as data in the cells of row 31. Metatag listing 210 is preferably sorted alphabetically as shown, but may be sorted in any other suitable manner such as by recency of use or relative frequency of use, or may be entirely unsorted. Drop down navigation button 261 advantageously accesses the various options. The display of metatag listing 210 also includes navigation button 262 that closes the display. Vertical or horizontal sliders (not shown) may also be included.
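The ordering options for the metatag listing might be sketched as below. The function name metatag_listing, the usage_log parameter (an assumed chronological record of previously selected metatags), and the sample names are hypothetical; the disclosure does not prescribe how recency or frequency is tracked.

```python
from collections import Counter

def metatag_listing(header_row, usage_log=(), order="alphabetical"):
    """Return the metatag names from a header row in the requested order."""
    names = [name for name in header_row if name]  # skip empty header cells
    if order == "alphabetical":
        return sorted(names, key=str.lower)
    if order == "frequency":
        counts = Counter(usage_log)                # how often each name was used
        return sorted(names, key=lambda n: (-counts[n], n.lower()))
    if order == "recency":
        # Position of each name's most recent use; unused names sort last.
        last_use = {name: i for i, name in enumerate(usage_log)}
        return sorted(names, key=lambda n: (-last_use.get(n, -1), n.lower()))
    return names                                   # "unsorted": keep sheet order

header = ["price", "description", "product number"]
log = ["price", "price", "description"]
print(metatag_listing(header))
print(metatag_listing(header, log, order="frequency"))
print(metatag_listing(header, log, order="recency"))
```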

[0024] Right clicking on any of the metatag names in metatag listing 210 produces an optional values listing as depicted by arrow 152. In this instance, metatag name 224 corresponds to the name stored in cell 41 (due to the sorting), and right clicking on it produces a values listing 310 comprising a list 320 of all the data items in the cells of column 21, including data in cells 43, 45, and 46. The display of values listing 310 also includes navigation button 361 that accesses sort functions, and navigation button 362 that closes the display. Vertical or horizontal sliders (not shown) may also be included.
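A values listing of this kind could be produced along the following lines. The sketch assumes the spreadsheet is held as a list of rows with the metatag names in the first row, and the sample data is invented to mirror the layout of FIG. 1, in which one column holds three values and another holds one.

```python
def values_listing(sheet, metatag):
    """Return the sorted, non-empty values stored under a metatag's column."""
    header, rows = sheet[0], sheet[1:]
    column = header.index(metatag)  # raises ValueError for an unknown metatag
    values = [row[column] for row in rows
              if column < len(row) and row[column] != ""]
    return sorted(values)

# Hypothetical sheet: first row holds metatag names, later rows hold data.
sheet = [
    ["price", "description"],
    ["19.99", "widget"],
    ["5.00", ""],
    ["12.50", ""],
]
print(values_listing(sheet, "price"))
```

Seeing the values already filed under a metatag gives the user a quick check on whether that metatag suits the item being tagged. The values here are text, so they sort as strings rather than as numbers.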

[0025] FIG. 2 depicts steps 410-460 that are preferably embodied in software 400. In practice, one would likely store software 400 in the internal memory (not shown) or the mass storage (not shown) of a computer system (not shown). Alternatively, the software or portions of it could be accessed as needed from the Internet or other network. In some or all of these steps the “user” is generally a human user. It is, however, also contemplated that the user can be an electronic entity, such as a computer program or virtual robot.

[0026] Step 410 involves identifying the item of data in a document. Here the user identifies what data is to be metatagged. This is preferably accomplished by blocking, which in the Microsoft™ world can currently be accomplished by typing a special code (currently F8), and using the arrow keys. Another current method is to hold down the left mouse button while “dragging” the cursor across the area to be blocked. Still another method, which is often used in spreadsheets, is to double click on a cell of the spreadsheet. Non-Microsoft™ systems may have other corresponding methods of accomplishing identification of data, and all such methods are contemplated. It is specifically contemplated that non-contiguous data can be blocked or otherwise identified, and brought together as a single set of data to be metatagged.

[0027] Step 420 involves activating an activation code. The software 400 would likely be activated by a hot key such as a right mouse click, although it should be appreciated that the term “right clicking” as used herein can be substituted by any number of other software accessing codes, including keystrokes and combinations of keystrokes.

[0028] Step 430 involves providing a listing of metatag choices. Such a listing can be derived from any number of sources. In the example of FIG. 1, the various metatags are stored in the cells of the first row 31 of the first sheet of the spreadsheet 10. In other embodiments the metatags may be stored on another sheet of the same spreadsheet 10, or in an entirely different file such as a database file.

[0029] There are numerous advantages to cross-utilizing metatags among many different users, and it is especially contemplated that collections of metatags can be made available across the Internet or some other shared electronic resource. Among other things it is contemplated that cross-utilization would encourage consistency among metatag databases, which ultimately can be extremely useful if any such databases need to be combined or accessed as a unit. In a favored embodiment, a web page or other executable public file could analyze text, keywords, or other data from the document being metatagged. The result of such analysis could then be used to select a subset of potentially useful metatags. For example, if a person is metatagging a web page that refers to automobiles, the system may select a subset of metatags having to do with automobiles, including for example make, model, year, color, condition, price. On the other hand if a person is metatagging a web page that refers to bananas, the system may select a subset of metatags having to do with fruit, including for example source country, producer, weight per box, price per box, ripeness, and so forth.
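One simple way such an analysis might work is a keyword lookup, sketched below. The CATALOG and KEYWORDS tables and the function suggest_metatags are hypothetical stand-ins for whatever shared resource an embodiment would consult; only the automobile and fruit metatag names come from the examples above.

```python
import re

# Hypothetical shared catalog of metatags, grouped by topic.
CATALOG = {
    "automobiles": ["make", "model", "year", "color", "condition", "price"],
    "fruit": ["source country", "producer", "weight per box",
              "price per box", "ripeness"],
}

# Hypothetical keywords that signal each topic.
KEYWORDS = {
    "automobile": "automobiles", "automobiles": "automobiles",
    "car": "automobiles", "banana": "fruit", "bananas": "fruit",
}

def suggest_metatags(text):
    """Return metatags for every topic whose keywords appear in the text."""
    topics = []
    for word in re.findall(r"[a-z]+", text.lower()):
        topic = KEYWORDS.get(word)
        if topic and topic not in topics:
            topics.append(topic)
    return [tag for topic in topics for tag in CATALOG[topic]]

print(suggest_metatags("Fresh bananas shipped weekly from our farm"))
```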

[0030] It is also contemplated that subsets of metatags may be chosen based upon a group tag. In FIG. 1, for example, bracket pairs 126A, 126B are symbolic of group metatags used to designate that data items 122 and 123 are related to each other, and that data items 124 and 125 are related to each other. Contemplated designations for group metatags include classifications such as automobiles, boats, real estate, foods, attorneys, and so forth. Such designations may advantageously be stored in rows along with associated data. Thus, the literal for the group designation of data items 122 and 123 is stored in the cell 48, and the literal for the group designation of data items 124 and 125 is preferably stored in the cell 49. Cell 47 preferably includes a metatag literal such as “Group”.

[0031] It is still further contemplated that subsets of metatags may be chosen based upon those metatags that have already been utilized in the document. Thus, in further metatagging of the document 110 of FIG. 1, a preferred method would be to recognize that group metatag literals are stored in cells 48, 49, and that individual metatag literals are stored in cells 41, 42. Any or all of that information could be used to locate a subset of perhaps forty to fifty other metatag literals that are commonly used in conjunction with these metatag literals. A search for such a subset can be performed using a local or networked database or other resource.
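A co-occurrence count is one plausible way to find metatags commonly used together. In the sketch below, the function co_used_metatags and the history argument (the sets of metatags found in previously tagged documents) are assumptions; the disclosure leaves the search mechanism open.

```python
from collections import Counter

def co_used_metatags(already_used, history, limit=50):
    """Rank metatags that appear alongside the ones already in the document."""
    used = set(already_used)
    counts = Counter()
    for tags in history:
        tags = set(tags)
        if used & tags:                  # shares at least one metatag with us
            counts.update(tags - used)   # count only metatags not yet in use
    return [tag for tag, _ in counts.most_common(limit)]

# Hypothetical record of metatags used in earlier documents.
history = [
    {"make", "model", "price"},
    {"make", "year"},
    {"ripeness", "producer"},
]
print(co_used_metatags(["make"], history))
```

The limit of fifty reflects the subset size suggested above; any other cutoff would serve equally well.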

[0032] The listing of metatag choices is preferably sorted alphabetically as discussed above with respect to list 210, and is preferably provided to a user using a CRT, laptop screen, or other visual type display. Alternatively, however, the listing can be provided by any other suitable means, such as by performing an audible reading of the list, or printing a paper copy of the listing.

[0033] Step 440 involves selecting a metatag from the listing of metatag choices. In this instance a user examines the listing of possible metatag choices provided in step 430, and selects an appropriate metatag for the data being tagged. Selection can be accomplished by any suitable method, including clicking on a particular choice within a visual display, audibly stating the choice, and so forth. Where there are no suitable choices, the system may allow the user to enter a new metatag, which can then be made available to others.

[0034] Step 450 involves identifying the selected metatag with a column in the spreadsheet. This step can be performed manually for very small spreadsheets, but should be performed automatically for any spreadsheet of substantial size. One method of accomplishing the identification is through a standard search command, preferably searching only those cells of the spreadsheet that are likely to contain a literal matching the selected metatag. Thus, if the metatags are stored in cells of the first row of the spreadsheet, as in FIG. 1, it is desirable that the search be directed to the first row only.
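Restricting the search to the first row can be sketched as follows, again assuming the spreadsheet is held as a list of rows. The function name find_column and the exact-match comparison are choices made for this sketch.

```python
def find_column(sheet, metatag):
    """Search only the first row and return the matching column index."""
    for index, name in enumerate(sheet[0]):
        if name == metatag:
            return index
    return None  # no column carries this metatag yet

# Hypothetical sheet: "price" also appears as data, but only row one is searched.
sheet = [
    ["price", "description"],
    ["19.99", "price"],
]
print(find_column(sheet, "description"))
```

Because only the header row is examined, a data cell that happens to contain the same literal as a metatag cannot produce a false match.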

[0035] Step 460 involves storing at least a portion of the item of data in a cell of the column. Once the column containing the selected metatag is identified, that column is used to store the earlier identified item of data. If this is the first item of data for a group, or if there is no group, then the item of data can be stored in a blank row. If previously associated items of data have already been stored in the spreadsheet for a given group, then the new item of data should be stored in the same row as the previously stored data, but in the recently identified column.
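The row selection rule of step 460 might be implemented as in the following sketch. It assumes a list-of-rows spreadsheet with a “Group” column holding the group designation, treats appending a new row as the equivalent of using a blank row, and uses the hypothetical function name store_item.

```python
def store_item(sheet, metatag, value, group=None, group_tag="Group"):
    """Store value under metatag, reusing the group's row when one exists."""
    header = sheet[0]
    column = header.index(metatag)  # raises ValueError if no such column
    group_column = header.index(group_tag) if group_tag in header else None

    # Reuse the row that already holds data for this group, if any.
    if group is not None and group_column is not None:
        for row_number in range(1, len(sheet)):
            if sheet[row_number][group_column] == group:
                sheet[row_number][column] = value
                return row_number

    # Otherwise start a fresh row (first item of a group, or no group).
    new_row = [""] * len(header)
    if group is not None and group_column is not None:
        new_row[group_column] = group
    new_row[column] = value
    sheet.append(new_row)
    return len(sheet) - 1

sheet = [["Group", "price", "description"]]
first = store_item(sheet, "price", "19.99", group="product A")
second = store_item(sheet, "description", "widget", group="product A")
third = store_item(sheet, "price", "5.00")
print(sheet)
```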

[0036] Although it is likely that the entire item of data will be stored in the spreadsheet as just discussed, it is also possible that only a subset of the item of data will be stored. This may occur for several reasons, including oversize of the item. In such instances the data may be truncated, with or without providing a warning of the same to the user. Another reason for storing less than the entire item of data includes a desire to eliminate undesirable words or phrases from the database.

[0037] Thus, specific embodiments and applications of data storage using spreadsheets and metatags have been disclosed. It should be apparent, however, to those skilled in the art that many more modifications besides those already described are possible without departing from the inventive concepts herein. The inventive subject matter, therefore, is not to be restricted except in the spirit of the appended claims. Moreover, in interpreting both the specification and the claims, all terms should be interpreted in the broadest possible manner consistent with the context. In particular, the terms “comprises” and “comprising” should be interpreted as referring to elements, components, or steps in a non-exclusive manner, indicating that the referenced elements, components, or steps may be present, or utilized, or combined with other elements, components, or steps that are not expressly referenced.
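Storing only a portion of an item could look like the sketch below. The 255-character limit, the function name fit_to_cell, and the word-by-word filtering are all assumptions; the returned flag lets a caller decide whether to warn the user about truncation.

```python
def fit_to_cell(item, max_chars=255, banned=()):
    """Drop unwanted words, then truncate to the assumed cell capacity."""
    unwanted = {word.lower() for word in banned}
    kept = [word for word in item.split() if word.lower() not in unwanted]
    cleaned = " ".join(kept)
    truncated = len(cleaned) > max_chars
    return cleaned[:max_chars], truncated

print(fit_to_cell("a very long value", max_chars=6))
print(fit_to_cell("cheap junk widget", banned={"junk"}))
```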

What is claimed is:
1. A method of storing an item of data in a spreadsheet, comprising: identifying the item of data in a document; activating an activation code; providing a listing of metatag choices; selecting a metatag from the listing of metatag choices; identifying the selected metatag with a column in the spreadsheet; and storing at least a portion of the item of data in a cell of the column.
2. The method of claim 1 wherein the document comprises a text document.
3. The method of claim 1 wherein the document comprises an HTML document.
4. The method of claim 1 wherein the step of identifying the item of data comprises blocking the item of data.
5. The method of claim 1 wherein the step of activating the activation code comprises entering a predetermined activation key combination.
6. The method of claim 1 wherein the step of providing a listing comprises visually displaying the listing.
7. The method of claim 1 wherein the step of selecting the metatag from the listing of metatag choices comprises clicking on the selected metatag.
8. The method of claim 1 further comprising storing the selected metatag in a cell of the column in a first row of the spreadsheet.
9. The method of claim 1 further comprising storing an identifier related to the selected metatag proximally with the at least a portion of the item of data in the document.
10. The method of claim 1 wherein the cell in which the at least a portion of the item of data is stored is included in one page of the spreadsheet, and the selected metatag is stored in another page of the spreadsheet different from the one page.
11. The method of claim 1 wherein the step of providing the listing of metatag choices comprises: determining a plurality of metatags previously associated with a plurality of columns of the spreadsheet, respectively; sorting the plurality of metatags into a sorted set; and displaying at least a subset of the sorted set of metatags.
12. The method of claim 1 further comprising: determining a plurality of values previously associated with the selected metatag; sorting the plurality of values into a sorted set; and displaying at least a subset of the sorted set of values.
13. The method of claim 1 further comprising: storing the spreadsheet in at least a first page and a second page, wherein the first page has a first plurality of columns associated with a first set of metatags, and the second page has a second plurality of columns associated with a second set of metatags that is not identical with the first set of metatags; and providing a classification index that assists a user in selecting one of the first and second pages in which to store the item of data; wherein the listing of metatag choices is determined from the set of metatags associated with the selected page; and the column in which the item of data is stored is one of the plurality of columns of the selected page.
14. A computer software having code that cooperates with a computer to store data in a spreadsheet by executing the steps of: detecting identification of the item of data in a document; detecting entry of an activation code; determining a plurality of metatags previously associated with a plurality of columns of the spreadsheet, respectively; sorting the plurality of metatags into a sorted set; displaying at least a subset of the sorted set of metatags; detecting selection of a metatag from the displayed subset of metatags; identifying the selected metatag with a column in the spreadsheet; and storing at least a portion of the item of data in a cell of the column.
15. The software of claim 14 further comprising storing an identifier related to the selected metatag proximally with the at least a portion of the item of data in the document.
16. The software of claim 14 further comprising: determining a plurality of values previously associated with the selected metatag; sorting the plurality of values into a sorted set; and displaying at least a subset of the sorted set of values.
17. The software of claim 14 further comprising: storing the spreadsheet in at least a first page and a second page, wherein the first page has a first plurality of columns associated with a first set of metatags, and the second page has a second plurality of columns associated with a second set of metatags that is not identical with the first set of metatags; and providing a classification index that assists a user in selecting one of the first and second pages in which to store the at least a portion of the item of data; wherein the listing of metatag choices is determined from the set of metatags associated with the selected page, and the column in which the at least a portion of the item of data is stored is one of the plurality of columns of the selected page.